A Real-time Music Scene Description System: Detecting Melody and Bass Lines in Audio Signals

Authors

  • Masataka Goto
  • Satoru Hayamizu
Abstract

This paper describes a predominant-pitch estimation method that enables us to build a real-time system for detecting melody and bass lines as a subsystem of our music scene description system. The purpose of this study is to build a real-time system that is practical from an engineering viewpoint, that offers suggestions for modeling music understanding, and that is useful in various applications. Most previous pitch-estimation methods assumed either a single-pitch sound with aperiodic noise or a few musical instruments, and had great difficulty dealing with complex audio signals sampled from compact discs, especially recordings of jazz or popular music with drum sounds. Our method can estimate the most predominant fundamental frequency (F0) in such signals containing sounds of various instruments because it does not rely on the F0’s frequency component, which is often overlapped by other sounds’ components; instead, it estimates the F0 by using the Expectation-Maximization algorithm on the basis of the harmonics’ frequency components within an intentionally limited frequency range. It also uses a multiple-agent architecture to stably track the temporal trajectory of the F0. Experimental results show that the system is robust enough to estimate the predominant F0s of the melody and bass lines in real-world audio signals.
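
The abstract does not spell out the model behind the EM step, so the following is only a minimal sketch of one way such an estimate could be set up, not the authors' implementation. It assumes the observed spectrum is treated as a probability distribution over log-frequency (in cents), that each F0 candidate has a fixed "tone model" made of equally weighted Gaussian harmonics, and that EM estimates one mixture weight per candidate. The names `tone_model` and `estimate_f0_weights`, the 20-cent harmonic width, the four harmonics, and the candidate grid are all hypothetical choices made for illustration. The synthetic input deliberately omits the fundamental and covers only a limited frequency range, to show that the harmonics alone can identify the F0.

```python
import math

def gaussian(x, mean, sigma):
    """Gaussian density at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def tone_model(x, f0, n_harmonics=4, sigma=20.0):
    """Assumed tone model: equally weighted Gaussian harmonics of f0 (all in cents)."""
    return sum(gaussian(x, f0 + 1200.0 * math.log2(h), sigma)
               for h in range(1, n_harmonics + 1)) / n_harmonics

def estimate_f0_weights(grid, observed, candidates, n_iter=50):
    """EM for the mixture weights of fixed tone models, one per F0 candidate."""
    weights = {f0: 1.0 / len(candidates) for f0 in candidates}
    models = {f0: [tone_model(x, f0) for x in grid] for f0 in candidates}
    for _ in range(n_iter):
        new = dict.fromkeys(candidates, 0.0)
        for i, p_obs in enumerate(observed):
            mix = sum(weights[f0] * models[f0][i] for f0 in candidates)
            if p_obs == 0.0 or mix == 0.0:
                continue  # nothing to attribute at this bin
            for f0 in candidates:
                # E-step: share of this bin explained by candidate f0;
                # M-step: accumulate it, weighted by the observed mass.
                new[f0] += p_obs * weights[f0] * models[f0][i] / mix
        total = sum(new.values())
        weights = {f0: w / total for f0, w in new.items()}
    return weights

# Synthetic test signal (an assumption): harmonics 2-4 of a tone at 3600 cents,
# observed only within 4200-6600 cents, so the fundamental itself is absent.
TRUE_F0 = 3600.0
grid = [4200.0 + 10.0 * i for i in range(241)]
raw = [sum(gaussian(x, TRUE_F0 + 1200.0 * math.log2(h), 20.0) for h in (2, 3, 4))
       for x in grid]
observed = [v / sum(raw) for v in raw]

candidates = [3000.0 + 100.0 * k for k in range(19)]  # 3000-4800 cents
weights = estimate_f0_weights(grid, observed, candidates)
predominant = max(weights, key=weights.get)
print(predominant, round(weights[predominant], 3))
```

In this sketch the candidate with the largest weight is taken as the predominant F0 for one frame. Tracking that value stably over time, which the abstract assigns to a multiple-agent architecture, is outside the scope of the example.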

Similar resources

A real-time music-scene-description system: predominant-F0 estimation for detecting melody and bass lines in real-world audio signals

In this paper, we describe the concept of music scene description and address the problem of detecting melody and bass lines in real-world audio signals containing the sounds of various instruments. Most previous pitch-estimation methods have had difficulty dealing with such complex music signals because these methods were designed to deal with mixtures of only a few sounds. To enable estimatio...

F0 Estimation of Melody and Bass Lines in Real-world Musical Audio Signals

This paper describes a method for estimating the fundamental frequency (F0) of melody and bass lines in monaural audio signals containing sounds of various instruments. Most previous methods assumed mixtures of only a few sounds and had great difficulty dealing with audio signals sampled from compact discs. Our method does not rely on the unreliable F0’s component and obtains the most predominant F...

A robust predominant-F0 estimation method for real-time detection of melody and bass lines in CD recordings

This paper describes a robust method for estimating the fundamental frequency (F0) of melody and bass lines in monaural real-world musical audio signals containing sounds of various instruments. Most previous F0-estimation methods had great difficulty dealing with such complex audio signals because they were designed to deal with mixtures of only a few sounds. To make it possible to estimate the...

Music scene description project: Toward audio-based real-time music understanding

  • The human auditory system does not extract each individual audio signal. Even if a mixture cannot be separated, a listener can still understand that the mixture includes certain components.
  • Untrained listeners understand music without mentally representing audio signals as scores. Even if we could derive separated signals and musical notes, it is still difficult to obtain high-level descriptions like melody and...

A Predominant-F0 Estimation Method for Real-world Musical Audio Signals

In this paper we describe a robust method, called PreFEst, for estimating the fundamental frequency (F0) of melody and bass lines in monaural audio signals containing sounds of various instruments. Most previous F0-estimation methods have difficulty dealing with such complex audio signals because they are designed for mixtures of only a few sounds. Without assuming the number of sound sources, ...

Publication date: 1999